Emerging Variants of Canine Enteric Coronavirus Associated with Outbreaks of Gastroenteric Disease

A 2022 canine gastroenteritis outbreak in the United Kingdom was associated with circulation of a new canine enteric coronavirus closely related to a 2020 variant with an additional spike gene recombination. The variants are unrelated to canine enteric coronavirus–like viruses associated with human disease but represent a model for coronavirus population adaptation.

on a GridION sequencing device for 72 hours using MinKNOW (Version 3.6.5) with live basecalling disabled.

Amplicon tiling
The version 2 primer scheme, which was based on a preliminary SISPA genome, was designed as described by Quick et al. (2). For details of primer schemes, reference genomes, and version 1 (pilot) data, see https://github.com/edwardcunningham-oakes/CECoV-outbreak-2022. Library preparation of targeted PCR amplicons was adapted from the nCoV-2019 sequencing v3 (ARTIC) protocol by Josh Quick (https://www.protocols.io/view/ncov-2019-sequencing-protocol-v3locost-bh42j8ye), with slight modifications as follows: PCR product (amplicon size 1200 bp) was normalized to 100 ng per sample for the end-preparation and adaptor ligation steps. The ONT Native Barcoding Ligation kit (EXP-NBD196) was used for multiplexing no more than 30 samples per flow cell. All samples were sequenced on FLO-MIN106D flow cells (R9.4.1 chemistry) on a GridION sequencing device for 72 hours using MinKNOW (Version 3.6.5) with live basecalling disabled.

Bioinformatics
Basecalling of SISPA and amplicon tiling Fast5 files was undertaken using Guppy v4.2.2. Output FASTQ files were demultiplexed using Porechop v0.2.4 (3) and quality filtered and primer trimmed using NanoFilt version 2.8.0 (size selection: 150-1500 bp [SISPA], 1000-1400 bp [amplicon tiling]; average Q score: ≥15; head and tail trimming: 18 bases [SISPA], 27 bases [amplicon tiling]) (4). Filtered and trimmed SISPA FASTQ files were uploaded to the online BugSeq v1 portal (https://bugseq.com; upload date: Mar 29, 2022) for metagenomic classification using the RefSeq database (5), with classification results summarized and viewed in Recentrifuge (6). Reads classified as alphacoronavirus were extracted using seqtk version 1.3-r106 (https://github.com/lh3/seqtk). BLASTn was then used on a subset of the SISPA and amplicon tiling reads to find the nearest GenBank hit, which served as the reference for mapping reads and building a consensus. The mpileup option in Samtools version 1.15 (7) was used to build a consensus, with areas of less than 5X coverage masked. For one genome (Dog 61/22) that had several low-coverage areas (<5X depth), amplicons from the V1 amplicon tiling scheme were incorporated to generate a consensus sequence. If reference genomes were not sufficient for mapping the divergent spike gene, a custom BLAST database of canine coronavirus spike genes was used to map reads and obtain a consensus of the S gene, which was then combined with the draft genome.
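The read filtering and trimming step can be illustrated with a minimal Python sketch. It is not the NanoFilt implementation: the function names are hypothetical, the length filter is assumed to apply before trimming, and the average Q score is assumed to be computed from the mean error probability. Only the thresholds (length ranges, Q ≥15, and 18 or 27 bases trimmed from each end) come from the text above.

```python
import math


def mean_q(phred_scores):
    """Average read quality, computed from the mean per-base error probability.

    Assumption: averaging is done on error probabilities, not on raw Phred values.
    """
    mean_err = sum(10 ** (-q / 10) for q in phred_scores) / len(phred_scores)
    return -10 * math.log10(mean_err)


def filter_and_trim(seq, phred_scores, min_len, max_len, min_q=15, crop=18):
    """Return (trimmed_seq, trimmed_scores), or None if the read is rejected.

    Assumption: the length window is checked on the untrimmed read.
    """
    if not (min_len <= len(seq) <= max_len):
        return None
    if mean_q(phred_scores) < min_q:
        return None
    end = len(seq) - crop
    # Remove `crop` bases from both the head and the tail of the read.
    return seq[crop:end], phred_scores[crop:end]


# Thresholds quoted in the text for each library type.
SISPA = dict(min_len=150, max_len=1500, crop=18)
AMPLICON = dict(min_len=1000, max_len=1400, crop=27)

if __name__ == "__main__":
    read = "ACGT" * 50  # toy 200-base read
    quals = [20] * len(read)
    kept = filter_and_trim(read, quals, **SISPA)
    print(len(kept[0]) if kept else "rejected")
```

A 200-base read passes the SISPA length window but is rejected under the amplicon tiling settings, which expect reads close to the 1200 bp amplicon size.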
Gubbins was used to assess recombination, based on genome alignments and phylogenetic relationships. Single nucleotide polymorphisms (SNPs) were identified in aligned whole-genome sequence data of related genomes and used to construct an initial phylogenetic tree (RAxML, with GTRCAT as the model). Regions of putative recombination were defined by using a minimum threshold of 3 SNPs (default).
In addition, because alphacoronavirus spike genes have been shown to undergo intragenic recombination, phylogenies of the different domains of the S genes generated in this study and those of other alphacoronaviruses (Feline/Canine coronaviruses I and II and Transmissible gastroenteritis virus) were generated using the above methods, with slight modifications as follows: PAL2NAL (8) was used to ensure accurate codon alignments, and the GTR+F+G4 model was used for tree construction.

Temporal analysis of Major Presenting Complaint (MPC) data
During our period of interest, the presence of social distancing (lockdown) restrictions due to COVID-19 had a marked effect on the apparent prevalence of MPCs. The cancellation of routine consults (e.g., vaccinations and health checks) reduced $N_t$ for weeks during which social distancing restrictions applied, though emergency consults for gastroenteric disease appear to still have taken place (9). The result of this was to increase the apparent prevalence of MPC during the affected weeks. To capture this effect, we introduced a dummy variable, $z_t$, taking the value 1 if week $t$ was affected by social distancing and 0 otherwise (10).
We then let

$$\operatorname{logit}(p_t) = \alpha + \beta t + \gamma z_t + u_t,$$

where $p_t$ is the probability that a consult in week $t$ is an MPC consult, $\alpha$ is the mean log odds of an MPC consult, $\beta$ represents a linear time trend capturing long-term drift in MPC prevalence (the effect on the log odds of MPC for a 1 week increase in time), $\gamma$ represents an offset in the log odds for MPC for weeks in which social distancing was imposed, and $u_t$ represents a time-varying random effect.
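As an illustration of how the four terms described above combine, the following minimal Python sketch converts the log odds into an expected MPC prevalence. It assumes the linear predictor is the simple sum of an intercept, a linear time trend, the social distancing offset, and the random effect; the function name and the parameter names (alpha, beta, gamma) are labels chosen for this sketch, and the numerical values in the example are arbitrary rather than estimates from the study.

```python
import math


def mpc_prevalence(t, z_t, u_t, alpha, beta, gamma):
    """Expected MPC prevalence in week t under an assumed additive log-odds model.

    t     : week index
    z_t   : 1 if week t was affected by social distancing, else 0
    u_t   : time-varying random effect for week t
    alpha : mean log odds of an MPC consult (assumed name)
    beta  : change in log odds per 1 week increase in time (assumed name)
    gamma : log-odds offset for social distancing weeks (assumed name)
    """
    log_odds = alpha + beta * t + gamma * z_t + u_t
    # The inverse logit maps log odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    # Arbitrary illustrative values, not fitted parameters.
    normal_week = mpc_prevalence(t=10, z_t=0, u_t=0.0, alpha=-3.0, beta=0.001, gamma=0.5)
    lockdown_week = mpc_prevalence(t=10, z_t=1, u_t=0.0, alpha=-3.0, beta=0.001, gamma=0.5)
    print(normal_week, lockdown_week)
```

With a positive offset, the same week has a higher apparent prevalence when the social distancing indicator is set to 1, which is the effect the dummy variable is intended to absorb.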
The random effect $u_t$ allows us to model periodic serial correlation in our weekly observations, as well as any extra-Binomial variation that might contribute to the overall variability of cases from 1 week to the next. We model the vector $u$ as a Gaussian process with mean 0 and covariance matrix $\Sigma^2$. The model was cast in a Bayesian setting, for which prior distributions were used for all unknown parameters $\alpha$, $\beta$, $\gamma$, $\sigma$, and $\tau$ (Table 1).
The model was fitted using the No-U-Turn Sampler implementation in the Python package PyMC3 for 6000 iterations with a 1000-iteration burn-in period, to provide numerical estimates of the joint posterior distribution (11).
The advantage of Bayesian inference in our context is that it allows us to compare our observations $y_t$, $t = 1, \ldots, 174$, with the filter distribution. For each observation, we calculate a probability $q_t$, highlighting timepoints where $0.95 < q_t < 0.99$ as possibly higher than expected and timepoints where $q_t > 0.99$ as likely to be higher than expected. Conversely, we identify cases where $0.025 < q_t < 0.05$ and $q_t < 0.025$ as possibly or likely to be lower than expected, respectively.
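The thresholds above amount to a simple labelling rule, sketched here in Python. The function name and label strings are hypothetical, and because the text uses strict inequalities, the handling of values that fall exactly on a threshold is an assumption of this sketch.

```python
def classify_week(q_t):
    """Label a weekly observation from its probability q_t.

    Thresholds follow the text; values exactly on a boundary are assigned
    to the less extreme category (an assumption, not stated in the text).
    """
    if q_t > 0.99:
        return "likely higher than expected"
    if q_t > 0.95:
        return "possibly higher than expected"
    if q_t < 0.025:
        return "likely lower than expected"
    if q_t < 0.05:
        return "possibly lower than expected"
    return "within expected range"


if __name__ == "__main__":
    for q in (0.995, 0.97, 0.5, 0.03, 0.01):
        print(q, classify_week(q))
```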
The covariance matrix $\Sigma^2$ captures the correlation between two variates $u_t$ and $u_s$ spaced $s-t$ weeks apart, and we assume the correlation follows a periodic function of this lag. In this function, $\sigma^2$ represents the variance between two timepoints spaced a year apart, $\phi$ represents the lengthscale of the correlation (essentially how correlated any two adjacent timepoints are), and $\tau^2$ represents extra-Binomial variability due to observation error. For identifiability reasons, we fix $\phi = 0.32$ year$^{-1}$, tuned manually to give a satisfactory amount of smoothing over the timeseries.
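To make the role of the three covariance parameters concrete, the sketch below builds a covariance matrix from one common periodic kernel, $\sigma^2 \exp\{-2\sin^2(\pi|s-t|/52)/\phi^2\}$, plus a diagonal term $\tau^2$. This functional form, the 52-week period, the treatment of the lengthscale as a unitless number, and all names in the code are assumptions made for illustration; the exact function used in the study may be parameterized differently.

```python
import math

WEEKS_PER_YEAR = 52  # assumed period of the seasonal correlation


def periodic_cov(s, t, sigma2, phi, tau2, period=WEEKS_PER_YEAR):
    """Covariance between u_s and u_t under an assumed periodic kernel.

    sigma2 : variance shared by timepoints a whole period apart
    phi    : lengthscale controlling how quickly correlation decays with lag
    tau2   : extra variance added on the diagonal only (observation error)
    """
    lag = abs(s - t)
    periodic = sigma2 * math.exp(-2.0 * math.sin(math.pi * lag / period) ** 2 / phi ** 2)
    return periodic + (tau2 if s == t else 0.0)


def covariance_matrix(n_weeks, sigma2, phi, tau2):
    """Full n_weeks x n_weeks covariance matrix as a list of lists."""
    return [
        [periodic_cov(s, t, sigma2, phi, tau2) for t in range(n_weeks)]
        for s in range(n_weeks)
    ]


if __name__ == "__main__":
    # 174 weekly observations and phi = 0.32 as in the text; sigma2 and tau2 are arbitrary.
    cov = covariance_matrix(174, sigma2=1.0, phi=0.32, tau2=0.1)
    print(cov[0][0], cov[0][1], cov[0][26], cov[0][52])
```

Under this form, weeks exactly one year apart share the full variance $\sigma^2$, whereas weeks half a year apart are almost uncorrelated when the lengthscale is small.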